Python: Promote XXE and XML-bomb queries #8634
After internal discussion, these will replace the `XmlEntityInjection` query, so we can have separate severities on DoS and the other (more serious) attacks. Note: These clearly don't work, since they are verbatim copies of the JS code, but I split it into multiple commits to clearly highlight what changes were made.
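To make the DoS-versus-XXE distinction concrete, here is a minimal Python sketch that is not part of the PR. It contrasts the two payload shapes the separate queries are concerned with. The payload strings and the helper name `expanded_length` are illustrative assumptions, and the entity expansion is deliberately tiny so that it is safe to run.

```python
import xml.etree.ElementTree as ET

# Internal entity expansion (the XML-bomb / DoS shape), kept tiny on purpose:
# each level repeats the previous entity three times.
SMALL_EXPANSION = """<!DOCTYPE r [
  <!ENTITY a "ha">
  <!ENTITY b "&a;&a;&a;">
  <!ENTITY c "&b;&b;&b;">
]>
<r>&c;</r>"""

# External entity (the XXE shape): the entity points at a resource outside
# the document. It is shown as data only; this sketch never parses it.
XXE_SHAPE = """<!DOCTYPE r [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<r>&xxe;</r>"""


def expanded_length(document: str) -> int:
    """Hypothetical helper: parse the document and measure the root's text."""
    root = ET.fromstring(document)
    return len(root.text or "")


if __name__ == "__main__":
    # A 3-character reference grows into a longer text node after expansion.
    print(len("&c;"), "->", expanded_length(SMALL_EXPANSION))
```

The first payload only costs the parser time and memory, while the second asks it to read something outside the document, which is one way to motivate giving the two query families different severities.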
I changed a few QLdocs so they fit the style we have used in Python... although I surely do regret having introduced a new style for how these QLDocs look :D
Note that most of the testing happens in the framework specific tests, with an inline-expectation test
Which doesn't raise that syntax error (at least not on my laptop)
Kept the test of SimpleXmlRpcServer, and kept the qhelp so it can be used to write the new qhelp files
and remove the old copy, we don't need it anymore :)
I found this resource quite good myself at least :)
Since there are other XML vulnerabilities that are not about parsing, this is more correct.
I forgot about the existing ones when I promoted it
I didn't find a good way to actually share the stuff, so we kinda just have 2 things that look very similar :|
And it's not possible to provide a parser argument either
- `XMLEtree` to `XmlEtree`
- `XMLSax` to `XmlSax`
- `LXML` to `Lxml`
- `XMLParser` to `XmlParser`
This also means that the detection of the values passed to these keyword arguments will no longer just be from a local scope, but can also be across function boundaries.
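As a rough illustration of the difference between local and cross-function detection, consider the Python sketch below. `StubXMLParser`, `load_setting`, and `build_parser` are hypothetical names invented for this example. The stub only mirrors the keyword names `resolve_entities` and `huge_tree` of `lxml.etree.XMLParser` so that the snippet runs without third-party packages, and its default values are arbitrary.

```python
class StubXMLParser:
    """Hypothetical stand-in for a parser constructor with security-relevant keywords."""

    def __init__(self, resolve_entities: bool = False, huge_tree: bool = False):
        self.resolve_entities = resolve_entities
        self.huge_tree = huge_tree


def local_case() -> StubXMLParser:
    # The keyword value is a literal at the call site, so a purely local
    # analysis can see it.
    return StubXMLParser(resolve_entities=True)


def load_setting() -> bool:
    # Assumed configuration source; the value originates in another function.
    return True


def build_parser(resolve: bool) -> StubXMLParser:
    # Here the keyword value is a parameter, so its origin lies outside
    # this function's local scope.
    return StubXMLParser(resolve_entities=resolve)


def cross_function_case() -> StubXMLParser:
    # The value travels load_setting() -> build_parser(resolve=...) -> keyword argument.
    return build_parser(load_setting())
```

Both functions end up with the same parser configuration, but only the first can be recognized by looking at a single function body.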
I tried to see whether
Performance evaluation looks good 👍
Originally made by @erik-krogh in https://github.com/github/codeql/pull/8693/files#diff-9627c1fb9a1cc77fb93e6b7e31af1a4fa908f2a60362cfb34377d24debb97398 Could not be applied directly to this PR, since this PR deletes the file.
yoff
left a comment
Good thorough job. Only minor comments. It looks like you deleted a poc (this_directory_not_extracted)?
 */
private class FileAccessFromLxmlParsing extends LxmlParsing, FileSystemAccess::Range {
  FileAccessFromLxmlParsing() {
    this = API::moduleImport("lxml").getMember("etree").getMember(["parse", "parseid"]).getACall()
I wondered if a utility to get the function name could make this smoother (like MethodCallNode has).
Since the old PoC was located with the framework testing code, and the framework testing code was moved to multiple folders, I decided to create a new top-level directory to contain that code: https://github.com/github/codeql/blob/714465bf39d97e31aa6f0a7aa01c57e16f3c3078/python/PoCs/XmlParsing/PoC.py -- I guess I should have mentioned this in a comment 😳 I'm not sure it's the best solution ever, since this means that the framework testing code is far removed from this PoC, but I also couldn't come up with a better solution 🤔
Co-authored-by: yoff <lerchedahl@gmail.com>
Found 5 vulnerabilities.
private DataFlow::TypeTrackingNode instance(DataFlow::TypeTracker t) {
  t.start() and
  result instanceof InstanceSource
  or
  exists(DataFlow::TypeTracker t2 | result = instance(t2).track(t2, t))
}
Check warning
Code scanning / CodeQL
Dead code
}

/** Gets a reference to an instance of `io.StringIO`/`io.BytesIO`. */
DataFlow::Node instance() { instance(DataFlow::TypeTracker::end()).flowsTo(result) }
Check warning
Code scanning / CodeQL
Dead code
Should have fixed the QL4QL alerts now as well 👍
yoff
left a comment
I do not want to drag this out over minor refactoring, but since you have now done the one for Lxml (which I recall considering), I want to suggest the only other one I considered. (I think the two branches of `elementTreeInstance` are fine to keep apart, since they have different semantics.)
Co-authored-by: yoff <lerchedahl@gmail.com>
yoff
left a comment
LGTM
This was a draft PR to ensure we don't merge it before we have agreed with JS/Ruby about the new `XmlParsing` concept, but they have given it the go-ahead, so we're ready to go 👍
With this PR, I have copied the query setup from JS (Ruby only has an XXE query). This means that no query will alert on 'DTD retrieval', which I will postpone for future work (this PR is already quite big).